A novel approach to data deduplication over the engineering-oriented cloud systems

نویسندگان

  • Zhe Sun
  • Jun Shen
  • Jianming Yong
چکیده

This paper presents a duplication-less storage system over the engineering-oriented cloud computing platforms. Our deduplication storage system, which manages data and duplication over the cloud system, consists of two major components, a front-end deduplication application and a mass storage system as backend. Hadoop distributed file system (HDFS) is a common distribution file system on the cloud, which is used with Hadoop database (HBase). We use HDFS to build up a mass storage system and employ HBase to build up a fast indexing system. With a deduplication application, a scalable and parallel deduplicated cloud storage system can be effectively built up. We further use VMware to generate a simulated cloud environment. The simulation results demonstrate that our deduplication storage system is sufficiently accurate and efficient for distributed and cooperative data intensive engineering applications

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Assessment Methodology for Anomaly-Based Intrusion Detection in Cloud Computing

Cloud computing has become an attractive target for attackers as the mainstream technologies in the cloud, such as the virtualization and multitenancy, permit multiple users to utilize the same physical resource, thereby posing the so-called problem of internal facing security. Moreover, the traditional network-based intrusion detection systems (IDSs) are ineffective to be deployed in the cloud...

متن کامل

DoS-Resistant Attribute-Based Encryption in Mobile Cloud Computing with Revocation

Security and privacy are very important challenges for outsourced private data over cloud storages. By taking Attribute-Based Encryption (ABE) for Access Control (AC) purpose we use fine-grained AC over cloud storage. In this paper, we extend previous Ciphertext Policy ABE (CP-ABE) schemes especially for mobile and resource-constrained devices in a cloud computing environment in two aspects, a ...

متن کامل

A Clustering Approach to Scientific Workflow Scheduling on the Cloud with Deadline and Cost Constraints

One of the main features of High Throughput Computing systems is the availability of high power processing resources. Cloud Computing systems can offer these features through concepts like Pay-Per-Use and Quality of Service (QoS) over the Internet. Many applications in Cloud computing are represented by workflows. Quality of Service is one of the most important challenges in the context of sche...

متن کامل

Data Deduplication: A Technique for Efficient Storage in Cloud

Cloud Computing allows users to outsource storage and computation to servers using internet. In this paper we propose a data Deduplication technique to reduce the amount of storage space and to save bandwidth. The convergent encryption technique has been proposed to protect the confidentiality of sensitive data. Our scheme has the feature of access which allows only valid users to share the dat...

متن کامل

A Dynamic Proxy Oriented Approach for Remote Data Integrity checking along with Secure Deduplication in Cloud

In Cloud computing users store data over remote servers instead of computer’s hard drive. This leads to several security problems since data is out of the control of the user. So, to protect against the security attacks and to preserve the data integrity in the cloud, Huaqun Wang et.al proposed proxy oriented remote data integrity checking (RDIC). However, this scheme only focuses on one-way va...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Integrated Computer-Aided Engineering

دوره 20  شماره 

صفحات  -

تاریخ انتشار 2013